Reading data


Textinho explicando o que foi feito no primeiro notebook

[ ]:
FILE_PATH = 'resampled_files/RESAMPLED_0101086161_20070516T060226.csv'
[ ]:
import pandas as pd

data = pd.read_csv(FILE_PATH)
data.head()
DATE WHITEFLUX
0 54236.757582 112521.329834
1 54236.767013 112758.045853
2 54236.776445 112943.042225
3 54236.785876 112562.266242
4 54236.795308 112789.303079
[ ]:
import numpy as np

time = data.DATE.to_numpy()
flux = data.WHITEFLUX.to_numpy()
[ ]:
from utils import *
[ ]:
curve = lightcurve.LightCurve(time, flux)
curve.plot()

Ideal Lowpass Filter - Traduzir


Um filtro bi-dimensional passa-baixa que deixa passar todas as frequências em um círculo de raio \(D_0\) a partir da origem e remove todas as frequências fora desse círculo é chamado de filtro passa-baixa ideal (ILPF) e é descrito como

\[\begin{split} H(u) = \begin{cases} 1, &\text{if } D(u) \le D_0 \\ 0, &\text{if } D(u) \ge D_0 \end{cases}\end{split}\]

onde \(D_0\) é uma constante positiva, e \(D(u)\) é a distância entre um ponto \(u\) até o centro do retângulo de frequência, ou seja, é definido por

\[D(u) = (u-P/2) \qquad (1)\]

sendo \(P\) o tamanho do vetor original preenchido (padded).

O ponto de transição entre \(H(u) = 1\) e \(H(u) = 0\) é chamado de frequência de corte

[ ]:
def ideal_filter(array, fourier_transform, cutoff_freq):
  n_time = len(array)
  D0 = cutoff_freq * n_time

  for i in range(len(fourier_transform)):
    if fourier_transform[i] > D0:
      fourier_transform[i] = 0

  y_filtered = np.real(np.fft.ifft(fourier_transform))
  y_filtered += (array.mean() - y_filtered.mean())

  return y_filtered

Plotting some results

[ ]:
filtered = curve.ideal_lowpass_filter(0.1)
filtered.view_filter_results()
[ ]:
filtered = curve.ideal_lowpass_filter(0.6)
filtered.view_filter_results()

Saving filtered data

[ ]:
cutoff_freqs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
[ ]:
PATH_DIR = 'C:/Users/guisa/Desktop/filters/ideal/cutoffFreq0'
[ ]:
%time

for cutoff_freq in cutoff_freqs:
   pass
   # Saving filtered data
   lightcurve.export_results_csv(WHERE_TO_SAVE_PATH=PATH_DIR+str(int(cutoff_freq*10)), filter_technique='ideal', cutoff_freq=cutoff_freq, order=None, numNei=None)
Wall time: 0 ns

Gaussian Lowpass Filter


The transfer function of a Gaussian 1-D lowpass filter (GLPFs) is defined by

\[H(u) = e^{-D^2(u)/2D^2_0}\]

where \(D(u)\) and \(D_0\) was defined on Eq. (1).

Note. The cutoff frequency must be given in Nyquist.

[ ]:
def gaussian_array(array, fourier_transform, cutoff_freq):
  # Extrating information of the signal
  n_time = len(array)
  D0 = cutoff_freq * n_time
  xc = n_time

  # Creating the filter array
  len_filter = len(fourier_transform)
  filter = np.zeros(len_filter)

  for i in range(len_filter):
    filter[i] = exp( (-(i-(xc-1.0))**2)/(2*((D0 * n_time)**2)) )

  return filter

Plotting some results - Completar

Note. It was observed that, at low cutoff frequencies (0.1 and 0.2 Nyquist), there is an effect in which the filtered curve undergoes a vertical displacement, as if a constant were added to the curve. This will not affect the modeling of the curves, but to prevent this effect from causing disturbances in the visualization of the results, the following operation is perfomed:

\[Flux\space Filtered\space += [mean(Raw\space Flux) - mean(Flux\space Filtered)]\]
[ ]:
filtered = curve.gaussian_lowpass_filter(cutoff_freq=0.1)
filtered.view_filter_results()
[ ]:
filtered = curve.gaussian_lowpass_filter(cutoff_freq=0.4)
filtered.view_filter_results()

Saving filtered data

[ ]:
cutoff_freqs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
[ ]:
PATH_DIR = 'C:/Users/guisa/Desktop/filters/gaussian/cutoffFreq0'
[ ]:
%time

for cutoff_freq in cutoff_freqs:
   # Saving filtered data
   lightcurve.export_results_csv(WHERE_TO_SAVE_PATH=PATH_DIR+str(int(cutoff_freq*10)), filter_technique='gaussian', cutoff_freq=cutoff_freq, order=None, numNei=None)
Wall time: 0 ns

Butterworth Lowpass Filter


The transfer function of a Butterworth 1-D lowpass filter (BLPF) of order \(n\), and with cutoff frequency at a distance \(D_0\) from the origin, is defined as

\[H(u) = \frac{1}{ 1 + [D(u) / D_0]^{2n} }\]

where \(D(u)\) and \(D_0\) was defined on Eq. (1).

By the definition, the Butterworth filter have two free parameters: the cutoff frequency and the filtering order. Then, we can modify both, as we can see on the code cell below, intending to have the best results possibles.

Note. The cutoff frequency must be given in Nyquist.

[ ]:
def butterworth_array(array, fourier_transform, cutoff_freq, order):
  # Extrating information of the signal
  n_time = len(array)
  D0 = cutoff_freq * n_time
  xc = n_time

  # Creating the filter array
  len_filter = len(fourier_transform)
  filter = np.zeros(len_filter)

  for i in range(len_filter):
    filter[i] = 1.0 / (1.0+(abs(i-(xc-1.0))/D0)**(2.0*order))

  return filter

Plotting some results

[ ]:
filtered = curve.butterworth_lowpass_filter(2, 0.1)
filtered.view_filter_results()
[ ]:
filtered = curve.butterworth_lowpass_filter(4, 0.3)
filtered.view_filter_results()

Saving filtered data

[ ]:
orders = [1, 2, 3, 4, 5, 6]
cutoff_freqs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
[ ]:
%time

for order in orders:
  for cutoff_freq in cutoff_freqs:
    # Saving filtered data
    PATH_DIR = 'C:/Users/guisa/Desktop/filters/butterworth/order'+str(order)+'/cutoffFreq0'+str(int(cutoff_freq*10))
    lightcurve.export_results_csv(WHERE_TO_SAVE_PATH=PATH_DIR, filter_technique='butterworth', cutoff_freq=cutoff_freq, order=order, numNei=None)
Wall time: 0 ns

Bessel Lowpass Filter


The transfer function, \(H(s)\), of a Bessel lowpass filter is defined by

\[\begin{split}H(s) = \frac{\theta_n (0) }{\theta_n (s/ \omega_0)} \\\end{split}\]

where

\[\theta_n (s) = \sum_{k=0}^{n} a_k s^k\]

and

\[\begin{split}a_k = \frac{(2n-k)!}{2^{n-k}k!(n-k)!} \\\end{split}\]

Showing how it works…

Parameters

[ ]:
order = 2
cutoff_freq = 0.6

Control lib

[ ]:
from control import *
[ ]:
### Computing ak

from math import factorial

coef = []
i = 0
while i <= order:
  ak = (factorial(2*order - i)) / ( 2**(order - i)*factorial(i)*factorial(order - i) )
  # print(ak)
  coef.append(ak)
  i += 1

print(coef)
[3.0, 3.0, 1.0]
[ ]:
### Computing θn(s)

s = TransferFunction.s
theta_array = []
k = 0
for k in range(order+1):
  theta_n = coef[k] * (s**k)
  theta_array.append(theta_n)
  # numerical_numerator = coef[0]
  # print(theta_n)

print(theta_array[0])
print(theta_array[1])
print(theta_array[2])

3
-
1


3 s
---
 1


s^2
---
 1

[ ]:
### Computing H(s)

coef_numerator = theta_array[0]

list_denominator = theta_array[:]
[ ]:
denominator = 0
for item in list_denominator:
  denominator += item

print(denominator)

s^2 + 3 s + 3
-------------
      1

[ ]:
### Filling in transfer function

G = coef_numerator / denominator

print(G)
print(type(G))

      3
-------------
s^2 + 3 s + 3

<class 'control.xferfcn.TransferFunction'>

Applying

[ ]:
def bessel(array, fourier_transform, cutoff_freq, order):

  # Extracting features from signal
  n_time = len(array)
  D0 = cutoff_freq * n_time
  xc = n_time

  # Creating the bessel transfer function array
  len_filter = len(fourier_transform)
  filter = np.zeros(len_filter)
  i=0

  for i in range(len_filter):
    filter[i] = np.real(evalfr(G, ( np.abs(i-(xc-1.0))/D0 )))

  return filter

Plotting some results

[ ]:
filtered = curve.bessel_lowpass_filter(2, 0.1, numExpansion=100)
filtered.view_filter_results()
[ ]:
filtered = curve.bessel_lowpass_filter(4, 0.3, numExpansion=100)
filtered.view_filter_results()

Saving filtered data

[ ]:
orders = [1, 2, 3, 4, 5, 6]
cutoff_freqs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
[ ]:
%time

for order in orders:
  for cutoff_freq in cutoff_freqs:
    # Saving filtered data
    PATH_DIR = 'C:/Users/guisa/Desktop/filters/bessel/order'+str(order)+'/cutoffFreq0'+str(int(cutoff_freq*10))
    lightcurve.export_results_csv(WHERE_TO_SAVE_PATH=PATH_DIR, filter_technique='bessel', cutoff_freq=cutoff_freq, order=order, numNei=None)

Median Filter


O filtro mediano, por sua vez, é aplicado de uma forma consideravelmente diferente, na qual cada valor dos dados filtrados corresponde a uma mediana de um grupo de valores adjacentes nos dados originais. Esse filtro já é utilizado com certa frequência em análises de curvas de luz.

[ ]:
from scipy.signal import medfilt

def median_filter(array, window_size):
    return medfilt(array, window_size)

Reference: Scipy Documentation

Plotting some results

[ ]:
filtered = curve.median_filter(3)
filtered.view_filter_results()
[ ]:
filtered = curve.median_filter(9)
filtered.view_filter_results()

Saving filtered data

[ ]:
neighbors = [3, 5, 7, 9, 11]
[ ]:
%time

for neighbor in neighbors:
  # Saving filtered data
  PATH_DIR = 'C:/Users/guisa/Desktop/filters/median/numNei'+str(int(neighbor))
  lightcurve.export_results_csv(PATH_DIR, 'median', cutoff_freq=None, order=None, numNei=neighbor)
Wall time: 0 ns

API


Here, you can apply all the filtering processes, with whatever parameters you want, on whatever lightcurve

Select the Lightcurve:

RESAMPLED_0100725706_20070516T060226

RESAMPLED_0101086161_20070516T060226

RESAMPLED_0101206560_20070516T060226

RESAMPLED_0101368192_20070516T060050

RESAMPLED_0102671819_20071023T223035

RESAMPLED_0102671819_20120112T183055

RESAMPLED_0102708694_20071023T223035

RESAMPLED_0102708694_20120112T183055

RESAMPLED_0102725122_20071023T223035

RESAMPLED_0102725122_20120112T183055

RESAMPLED_0102764809_20071023T223035

RESAMPLED_0102890318_20070206T133547

RESAMPLED_0102912369_20070203T130553

RESAMPLED_0105118236_20100708T204534

RESAMPLED_0105209106_20080415T231048

RESAMPLED_0105228856_20100408T223049

RESAMPLED_0105793995_20080415T231048

RESAMPLED_0105819653_20080415T231048

RESAMPLED_0105833549_20080415T231048

RESAMPLED_0105891283_20080415T231048

RESAMPLED_0106017681_20080415T231048

RESAMPLED_0110839339_20081116T190224

RESAMPLED_0110864907_20081116T190224

RESAMPLED_0221686194_20081011T143035

RESAMPLED_0300001097_20081116T190224

RESAMPLED_0310247220_20090403T220030

RESAMPLED_0311519570_20090403T220030

RESAMPLED_0315198039_20100305T001525

RESAMPLED_0315211361_20100305T001525

RESAMPLED_0315239728_20100305T001525

RESAMPLED_0630831435_20110708T151253

RESAMPLED_0652180928_20110708T151253

RESAMPLED_0652180991_20110708T151253

[2]:
LIGHTCURVE = 'RESAMPLED_0102671819_20120112T183055'

Filtering processes availables:

  • Ideal (Cutoff frequency)

  • Gaussian (Cutoff frequency)

  • Butterworth (Order, Cutoff frequency)

  • Bessel (Order, Cutoff frequency, numExpansion=100)

  • Median (Number of neighbors)

[5]:
from utils import *
import pandas as pd
import numpy as np
[6]:
data = pd.read_csv('https://raw.githubusercontent.com/Guilherme-SSB/IC-CoRoT_Kepler/main/resampled_files/' + LIGHTCURVE + '.csv')
time = data.DATE.to_numpy()
flux = data.WHITEFLUX.to_numpy()

curve = lightcurve.LightCurve(time, flux)
curve.plot()
[ ]:
help(curve.how_to_filter)
Help on method how_to_filter in module utils.lightcurve:

how_to_filter(order: int, cutoff_freq: float, numNei: int, numExpansion: int) method of utils.lightcurve.LightCurve instance
    This function describes how to filtering using this library

    Parameters
    ----------
    order : int
        Used in Butterworth and Bessel filtering. Matches the filter order.

    cutoff_freq : float
        Used in Ideal, Gaussian, Butterworth and Bessel filtering. Matches the cutoff frequency.

    numNei : int
        Used in Median. Matches the number of neighbors to consider

    numExpansion : int
        Used in all processes. Corresponds to how much you want to expanded the curve's edges
        (to avoid some problems caused by the Fast Fourier Transform algorithm).
        Preliminary tests show that all processes works fine with numExpansion=70,
        except for Bessel filtering which required numExpansion=100


    Methods
    -------
    curve.ideal_lowpass_filter(cutoff_freq)

    curve.gaussian_lowpass_filter(cutoff_freq)

    curve.butterworth_lowpass_filter(order, cutoff_freq)

    curve.bessel_lowpass_filter(order, cutoff_freq)

    curve.median_filter(numNei)


    Examples
    --------
    >>> from utils import *
    >>> curve = lightcurve.Lightcurve(time=[1, 2, 3, 4], flux=[10, 15, 12, 20])
    >>> print(curve)
    <utils.lightcurve.LightCurve object>
    >>> curve.plot()

    >>> filtered = curve.bessel_lowpass_filter(order=3, cutoff_freq=0.4, numExpansion=100)
    >>> print(filtered)
    <utils.lightcurve.FilteredLightCurve object>

    >>> filtered.view_filter_results()

[ ]:
filtered = curve.gaussian_lowpass_filter(cutoff_freq=0.4)
filtered.view_filter_results()